Abstract

A classical limit theorem of stochastic process theory concerns the sample 
cumulative distribution function (CDF) from independent random variables. 
If the variables are uniformly distributed then these centered CDFs converge 
in a suitable sense to the sample paths of a Brownian Bridge. The so-called 
Hungarian construction of Komlós, Major and Tusnády provides a strong form
of this result. In this construction the CDFs and the Brownian Bridge sample 
paths are coupled through an appropriate representation of each on the same 
measurable space, and the convergence is uniform at a suitable rate. 

Within the last decade several asymptotic statistical-equivalence theorems 
for nonparametric problems have been proven, beginning with Brown and Low 
(1996) and Nussbaum (1996). The approach here to statistical-equivalence is 
firmly rooted within the asymptotic statistical theory created by L. Le Cam 
but in some respects goes beyond earlier results. 

This talk demonstrates the analogy between these results and those from 
the coupling method for proving stochastic process limit theorems. These two 
classes of theorems possess a strong inter-relationship, and technical methods 
from each domain can profitably be employed in the other. Results in a 
recent paper by Carter, Low, Zhang and myself will be described from this 
perspective. 

1. Probability setting 

1.1. Background 

Let $F$ be the CDF of a probability distribution on $[0,1]$, with $F$ absolutely
continuous and density

$$f = \frac{\partial F}{\partial x} \quad \text{on } [0,1].$$

Let $X_1,\dots,X_n$ be iid from $F$. $F_n$ denotes the sample CDF,

$$F_n(x) = \frac{1}{n}\sum_{j=1}^{n} I_{[0,x]}(X_j).$$

Let $Z_n$ denote the corresponding sample "bridge",

$$Z_n(x) = F_n(x) - F(x). \qquad (1)$$

Let $W(t)$ denote the standard Wiener process on $[0,1]$ and let $W_n$ denote the
white noise process with drift $f$ and local variance $f(t)/n$. Thus $W_n$ solves

$$dW_n(t) = f(t)\,dt + \sqrt{\frac{f(t)}{n}}\,dW(t).$$

An alternate description of $W_n$ is that it is the Gaussian process with mean
$F(t)$ and independent increments having

$$\operatorname{var}\big(W_n(t) - W_n(s)\big) = \frac{1}{n}\big(F(t) - F(s)\big), \quad \text{for } 0 \le s < t \le 1.$$
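The agreement between the stochastic differential equation and the
independent-increments description can be checked numerically. The following is
only a minimal sketch under stated assumptions: the density f(t) = 2t (so that
F(t) = t^2), the Euler discretization, and all function names are hypothetical
choices made for illustration and do not come from the text.

```python
import math
import random

def simulate_increment(f, n, s, t, steps=200, rng=random):
    """Euler scheme for dW_n = f dt + sqrt(f/n) dW; returns W_n(t) - W_n(s)."""
    h = (t - s) / steps
    total = 0.0
    for i in range(steps):
        u = s + (i + 0.5) * h  # midpoint of the i-th sub-interval
        total += f(u) * h + math.sqrt(f(u) * h / n) * rng.gauss(0.0, 1.0)
    return total

def check_increment_variance(n=50, s=0.2, t=0.8, reps=4000, seed=0):
    """Compare simulated mean/variance of an increment with F(t)-F(s) and (F(t)-F(s))/n."""
    rng = random.Random(seed)
    f = lambda u: 2.0 * u   # hypothetical density (assumption), so F(t) = t**2
    F = lambda u: u * u
    draws = [simulate_increment(f, n, s, t, rng=rng) for _ in range(reps)]
    mean = sum(draws) / reps
    var = sum((d - mean) ** 2 for d in draws) / (reps - 1)
    return mean, var, F(t) - F(s), (F(t) - F(s)) / n
```

Calling `check_increment_variance()` returns the simulated mean and variance
followed by the two theoretical targets, so the pairs can be compared directly.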



The analog of $Z_n$ is the Gaussian Bridge, defined by

$$B_n(t) = W_n(t) - F(t)\,W_n(1).$$

There are various ways of describing the stochastic similarity between $Z_n$ and
$B_n$. For example Komlós, Major, and Tusnády (1975, 1976) proved a result of the
form

Theorem (KMT): Given any absolutely continuous $F$, $\{X_1,\dots,X_n\}$ can be
defined on a probability space on which $B_n$ can also be defined as a (randomized)
function of $\{X_1,\dots,X_n\}$. This can be done in such a way that $B_n$ has the
Gaussian Bridge distribution, above, and

$$P_F\Big( \sup_{t\in[0,1]} \sqrt{n}\,\big| Z_n(t) - B_n(t) \big| > a_n \Big) \le c. \qquad (2)$$

Here $c > 0$ and $a_n$ are suitable positive constants with $a_n \sim (d \log n)/\sqrt{n}$ for
some $d > 0$. The process $B_n$ can be constructed as a (randomized) function of $Z_n$,
that is, $B_n(t) = Q_n\big(Z_n(t)\big)$. It should be noted that the construction depends on
knowledge of $F$.

[Various authors, such as Csörgő and Révész (1981) and Bretagnolle and Massart
(1989) have given increasingly detailed and precise values for $a_n$ and $c = c(a_n)$,
and also uniform (in $n$) versions of (2). These are not our focus.]
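To give a feel for the scales involved, the sketch below computes the supremum of
the sample bridge exactly from the order statistics and rescales it by the square
root of n. It assumes the uniform case F(x) = x; the function names and the seed
are illustrative assumptions, and the code does not implement the KMT coupling
itself.

```python
import math
import random

def sup_sample_bridge(xs, F=lambda x: x):
    """sup over t of |F_n(t) - F(t)| for a sample xs and a continuous CDF F."""
    n = len(xs)
    best = 0.0
    for j, x in enumerate(sorted(xs), start=1):
        Fx = F(x)
        # F_n jumps from (j-1)/n to j/n at the j-th order statistic
        best = max(best, j / n - Fx, Fx - (j - 1) / n)
    return best

def scaled_sup(n, seed=0):
    """sqrt(n) * sup|Z_n| for n uniform draws (uniform F is an assumption)."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    return math.sqrt(n) * sup_sample_bridge(xs)
```

Evaluating `scaled_sup(n)` for increasing n gives values that stay of order one,
which is the scale on which the Gaussian Bridge approximates the sample bridge.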
1.2. Extensions

1. Results like the above also extend to functional versions of the process $Z_n$.
Various authors including Dudley (1978), Massart (1989) and Koltchinskii (1994)
have established results of the following form.

Let $q:[0,1]\to\mathbb{R}$ be of bounded variation. One can define

$$Z_n(q) = \int q\,d\{F_n - F\} = \int (F - F_n)\,dq.$$

(Thus, $Z_n(x) = Z_n(I_{[0,x]})$.) There is a similar definition for $B_n(q)$ as a
stochastic integral. (See, for example, Steele (2000).) Then the KMT theorem
extends to a fairly broad, but not universal, class of functions, $\mathcal{Q}$. That is, for each
$F$, $B_n$ can be defined to satisfy

$$P_F\Big( \sup_{q\in\mathcal{Q}} \sqrt{n}\,\big| Z_n(q) - B_n(q) \big| > a'_n \Big) \le c \quad \text{where } \{a'_n\} \text{ depends on } \mathcal{Q}. \qquad (3)$$

(For most classes $\mathcal{Q}$, $a'_n\sqrt{n}/\log n \to \infty$ so that $a'_n \gg a_n$.)

2. Bretagnolle and Massart (1989) proved a similar result for inhomogeneous
Poisson processes. Let $\{T_1,\dots,T_N\}$ be (ordered) observations from an
inhomogeneous Poisson process with cumulative intensity function $nF$ and,
correspondingly, (local) intensity $nf$. Note that $N\sim\text{Poisson}(n)$ and conditionally
given $N$ the values of $\{T_1,\dots,T_N\}$ are the order statistics corresponding to an iid
sample from the distribution $F$. In this context we continue to define

$$F_n(t) = n^{-1}\Big\{ \sum_{j=1}^{N} I_{[0,t]}(T_j) \Big\}$$

where the term in braces now has a Poisson distribution with mean $nF(t)$. Also,
continue to define $Z_n(t) = F_n(t) - F(t)$ as in (1). (But, note that it is no longer
true that $Z_n(1) = 0$, w.p.1, as was the case in (1).)
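The Poissonized setting is easy to simulate, which makes the remark about the
endpoint concrete: here $F_n(1) = N/n$, so the bridge no longer vanishes at t = 1.
The sketch below is a minimal illustration, assuming a hypothetical CDF
F(t) = t^2 with inverse sqrt(u); the function names are not from the text. It
generates a rate-n homogeneous Poisson process on [0,1] from exponential gaps and
maps the points through the inverse of F.

```python
import math
import random

def poisson_process_points(n, F_inv, rng):
    """Ordered points of a Poisson process on [0,1] with cumulative intensity n*F."""
    points = []
    u = rng.expovariate(n)          # first arrival of a rate-n homogeneous process
    while u < 1.0:
        points.append(F_inv(u))     # time change: T = F^{-1}(U)
        u += rng.expovariate(n)
    return points

def sample_cdf_at(points, n, t):
    """F_n(t) = (number of points <= t) / n, normalized by n rather than by N."""
    return sum(1 for p in points if p <= t) / n

def endpoint_of_bridge(n, seed=0):
    """Z_n(1) = N/n - 1 for F(t) = t**2 (a hypothetical choice of F)."""
    rng = random.Random(seed)
    pts = poisson_process_points(n, math.sqrt, rng)
    return sample_cdf_at(pts, n, 1.0) - 1.0
```

Repeated calls with different seeds give values of `endpoint_of_bridge(n)` that
fluctuate around zero with standard deviation close to 1/sqrt(n).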

Then versions of the conclusions (2) and (3) remain valid. We give an explicit
statement since this result will provide a model for our later development.

Theorem (BM): Given any $n$ and any absolutely continuous $F$ the observations
$\{T_1,\dots,T_N\}$ of the inhomogeneous Poisson process can be defined on a
probability space on which $B_n$ can also be defined as a (randomized) function of
$\{T_1,\dots,T_N\}$. This can be done in such a way that $B_n$ has the Gaussian Bridge
distribution, above, and

$$P_F\Big( \sup_{t\in[0,1]} \sqrt{n}\,\big| Z_n(t) - B_n(t) \big| > a_n \Big) \le c. \qquad (4)$$

Here $c > 0$ and $a_n$ are suitable constants with $a_n \sim d\log n/\sqrt{n}$.

Remark: Clearly there must be extensions of (3) that are valid for the Poisson
case also, although we are not aware of an explicit treatment in the literature. Such
a statement would conclude in this setting that

$$P_F\Big( \sup_{q\in\mathcal{Q}} \sqrt{n}\,\big| Z_n(q) - B_n(q) \big| > a'_n \Big) \le c \quad \text{where } \{a'_n\} \text{ depends on } \mathcal{Q}. \qquad (5)$$
2. Main results

The objective is a considerably modified version of (3) and (5) that is stronger
in several respects and (necessarily) different in others. We will concentrate for most
of the following on the statement (5) since our results are slightly stronger and more
natural in this setting. The extension of (3) will be deferred to a concluding section.

Expression (5) involves the target function $B_n$. In the modified version the
role of target function is instead played by $W_n$ which is the solution to the stochastic
differential equation

$$dW_n(t) = g(t)\,dt + \frac{1}{2\sqrt{n}}\,dW(t) \qquad (6)$$

where $g(t) = \sqrt{f(t)}$. An alternate description of $W_n$ is thus

$$W_n(t) = G(t) + W(t)/(2\sqrt{n}) \quad \text{where } G(t) = \int_0^t \sqrt{f(\tau)}\,d\tau. \qquad (7)$$

(In the special case where $f$ is the uniform density, $f \equiv 1$, this $W_n$ coincides
with the process $W_{4n}$ of Section 1.1.)
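One heuristic for pairing the drift sqrt(f) with the noise level 1/(2 sqrt(n)) is
variance stabilization: the square root of a Poisson count with large mean has
variance close to 1/4. This heuristic is offered here as an assumption about the
motivation and is not a step of the constructions described in the text. The
sketch below evaluates the variance exactly from the Poisson probabilities; the
function names are illustrative.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed on the log scale for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def var_sqrt_poisson(lam):
    """Var(sqrt(X)) for X ~ Poisson(lam), using E[X] = lam."""
    kmax = int(lam + 20.0 * math.sqrt(lam) + 50)   # truncation point (assumption)
    mean_root = sum(math.sqrt(k) * poisson_pmf(k, lam) for k in range(kmax + 1))
    return lam - mean_root ** 2
```

For means such as 25, 100 and 1000 the value of `var_sqrt_poisson` is close to
0.25, consistent with a count of mean n f(t) dt contributing local variance of
about 1/(4n) on the square-root scale after division by n.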

The role of the constructed random process $Z_n$ is now played by a differently
constructed process, again written $Z_n$. As before $Z_n$ depends only on
$\{T_1,\dots,T_N\}$, and not otherwise on their CDF, $F$. This version also involves a
large set, $\mathcal{F}$, of absolutely continuous CDFs. Both $Z_n$ and $\mathcal{F}$ will be described
later in more detail. Here are statements of the main results.

Theorem 1: Let $\mathcal{F}$ be a set of densities satisfying Assumption A or A',
below. Let $\mathcal{Q}$ be the set of all functions of bounded variation. Let $\{T_1,\dots,T_N\}$ be
an inhomogeneous Poisson process with local intensity $nf$. The process $Z_n$ can be
constructed as a (randomized) function of $\{T_1,\dots,T_N\}$, with the construction not
depending on $f$. The Gaussian process $W_n$ having the distribution (6) can also
be defined on this same space as a (randomized) function of $\{T_1,\dots,T_N\}$. [This
construction depends on $f$ on a set of probability at most $c_n$.] This can be done in
such a way that

$$\sup_{f\in\mathcal{F}} P_f\Big( \sup_{q\in\mathcal{Q}} \big| Z_n(q) - W_n(q) \big| > 0 \Big) \le c_n \to 0. \qquad (8)$$



To be more precise, the phrase in brackets refers to the fact that there is a basic
construction, independent of $f$, and that this construction must then be modified
on a set of measure at most $c_n$ with this set and the modification depending on $f$.

For the situation of iid variables, as in (1), a similar result holds. In this case
the matching Gaussian process is again $W_n$, rather than the Brownian bridge of the
KMT theorem.

Theorem 2: Let $\mathcal{F}$ be a set of densities satisfying Assumption B, below. Let $\mathcal{Q}$
be the set of all functions of bounded variation. Given any $n$ and $f\in\mathcal{F}$, iid variables
$\{X_1,\dots,X_n\}$ with density $f$ can be defined on a probability space. A process $Z_n$ can
be constructed as a (randomized) function of $\{X_1,\dots,X_n\}$, with the construction
not depending on $f$. The Gaussian process $W_n$ having the distribution (6) can also
be defined on this same space as a (randomized) function of $\{X_1,\dots,X_n\}$. [This
construction depends on $f$, but only on a set of probability at most $c_n$.] This can be
done in such a way that

$$\sup_{f\in\mathcal{F}} P_f\Big( \sup_{q\in\mathcal{Q}} \big| Z_n(q) - W_n(q) \big| > 0 \Big) \le c_n \to 0. \qquad (9)$$



3. Statistical background 
3.1. Settings 

The first purpose of the discussion here is to motivate the probabilistic results
described above. A second purpose is to state the result on which to base the proof
of Theorem 1. The setting involves two statistical formulations:

Formulation 1 (nonparametric inhomogeneous Poisson process): The observations
are $T = \{T_1,\dots,T_N\}$ from the Poisson process with local intensity $nf$, $f\in\mathcal{F}$.
The problem is "nonparametric" because the "parameter space", $\mathcal{F}$, is a very large
set - too large to be smoothly parameterized by a mapping from (a subset of) a
finite dimensional Euclidean space. Some possible forms for $\mathcal{F}$ are discussed below.
The statistician desires to make some sort of inference, $\delta$, (possibly randomized)
based on the observation of $T$.

Formulation 1' (nonparametric density with random sample size): The relation
between Poisson processes and density problems has been mentioned above. As a
consequence, Formulation 1 is equivalent to a situation where the observations are
$\{X_1,\dots,X_N\}$ with $N\sim\text{Poisson}(n)$ and $\{X_1,\dots,X_N\}$ the order statistics from a sample
of size $N$ from the distribution with density $f$. Clearly, this situation is closely
related to the more familiar one in which the observations are $\{X_1,\dots,X_n\}$ with $n$
specified in advance.

Formulation 1" (nonparametric density with fixed sample size): This formulation
refers to the more conventional density setting in which the observations are
$\{X_1,\dots,X_n\}$ iid with density $f$.

Formulation 2 (white noise with drift): The statistician observes a white noise
process $dW_n(t)$, $t\in[0,1]$, with drift $g\in\mathcal{G}$ and local variance $1/4n$. Thus

$$dW_n(t) = g(t)\,dt + \frac{1}{2\sqrt{n}}\,dW(t),$$

and $W_n(t) - G(t) = W(t)/(2\sqrt{n})$ where $G(t) = \int_0^t g(\tau)\,d\tau$. Again $\mathcal{G}$ is a very large
- hence "nonparametric" - parameter space. Throughout, $\mathcal{G} \subset \mathcal{L}_2 = \{g : \int g^2 < \infty\}$.
As of now, there need be no relation between $f$ in Formulation 1 and $g$ in
Formulation 2, but such a relation will later be assumed in connection with Theorem
1, where

$$g = \sqrt{f} \quad \text{and} \quad \mathcal{G} = \{\sqrt{f} : f\in\mathcal{F}\}. \qquad (10)$$






Lawrence D. Brown 



This can alternatively be considered as a statistical formulation having parameter
space $\mathcal{F}$ under the identification (10). We take this point of view in the BCLZ
theorem, below.



3.2. Constructive asymptotic statistical equivalence 

Here is one definition of the strongest form of such an equivalence. 



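Definition (asymptotic equivalence): Let $\mathcal{P}_j^{(n)} = \big(\mathcal{X}_j^{(n)}, \mathcal{B}_j^{(n)}, \mathcal{F}_j^{(n)}\big)$, $j =
1,2$, $n = 1,2,\dots$ be two sequences of statistical problems on the same sequence of
parameter spaces, $\Theta^{(n)}$. Hence, $\mathcal{F}_j^{(n)} = \{F_{\theta,j}^{(n)} : \theta\in\Theta^{(n)}\}$. Then $\mathcal{P}_1$ and $\mathcal{P}_2$ are
asymptotically equivalent if there exist (randomized) mappings $Q_j^{(n)} : \mathcal{X}_j^{(n)} \to \mathcal{X}_k^{(n)}$,
$j,k = 1,2$, $k\ne j$, such that

$$\sup_{\theta\in\Theta^{(n)}} \big\| Q_j^{(n)}F_{\theta,j}^{(n)} - F_{\theta,k}^{(n)} \big\|_{TV} \to 0, \quad j,k = 1,2,\ k\ne j, \qquad (11)$$

where $Q_j^{(n)}F_{\theta,j}^{(n)}$ denotes the distribution of $Q_j^{(n)}(X)$ when $X$ has distribution
$F_{\theta,j}^{(n)}$, and $\|\cdot\|_{TV}$ denotes the total variation norm.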

This definition involves a reformulation of the general theory originated by
Le Cam (1953, 1964). See also Le Cam (1986), Le Cam and Yang (2000), van der
Vaart (2002) and Brown and Low (1996) for background on this theory including
several alternate versions of the definition and related concepts, a number of
conditions that imply asymptotic equivalence, and many applications to a variety
of statistical settings. Note that both Formulations 1 and 2 involve an index, $n$,
and can thus be considered as sequences of statistical problems in the sense of the
definition.
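The total variation quantity in the definition can be made concrete for discrete
distributions. The sketch below is a toy illustration only: it uses the
convention that the distance is half the sum of absolute differences of the
probability functions (the norm of the signed measure is twice this), and it
compares Binomial(n, lam/n) with Poisson(lam), which is a standard textbook pair
and not one of the formulations discussed here. All function names are
illustrative assumptions.

```python
import math

def tv_distance(p, q):
    """Total variation distance between two pmfs given as dicts value -> prob."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def binomial_pmf(n, prob):
    """Binomial(n, prob) pmf, computed on the log scale."""
    out = {}
    for k in range(n + 1):
        log_p = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                 + k * math.log(prob) + (n - k) * math.log(1.0 - prob))
        out[k] = math.exp(log_p)
    return out

def poisson_pmf(lam, kmax):
    """Poisson(lam) pmf on 0..kmax (truncation point kmax is an assumption)."""
    return {k: math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
            for k in range(kmax + 1)}

def tv_binomial_poisson(n, lam=2.0):
    return tv_distance(binomial_pmf(n, lam / n), poisson_pmf(lam, max(n, 60)))
```

Evaluating `tv_binomial_poisson(n)` for n = 10, 100, 1000 shows the distance
shrinking as n grows, which is the kind of convergence that (11) requires
uniformly over the parameter space.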



3.3. Spaces of densities (or intensities) 

Suitable families of densities, $\mathcal{F}$, can be defined via Besov norms with respect
to the Haar basis. The Besov norm with index $\alpha$ and shape parameters $p = q$ can
most conveniently be defined via the stepwise approximants to $f$ at resolution level
$k$. These approximants are defined as

$$f_k(t) = \sum_{\ell=0}^{2^k-1} I_{[\ell/2^k,\,(\ell+1)/2^k)}(t)\; 2^k \int_{\ell/2^k}^{(\ell+1)/2^k} f,$$

and the Besov($\alpha$,$p$) norm is defined as

$$\|f\|_{\alpha,p} = \Big\{ |f_0|^p + \sum_{k=0}^{\infty} 2^{k\alpha p}\, \|f_k - f_{k+1}\|_p^p \Big\}^{1/p}.$$
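The stepwise approximants and the resulting norm can be computed numerically. The
sketch below is a minimal illustration under stated assumptions: cell averages
are approximated by a midpoint rule, the infinite sum over resolution levels is
truncated at a finite `max_level`, and the function names are hypothetical.

```python
def haar_step_values(f, k, grid=64):
    """Averages of f over the 2**k dyadic cells of [0,1] (midpoint rule)."""
    cells = 2 ** k
    h = 1.0 / (cells * grid)
    values = []
    for l in range(cells):
        a = l / cells
        integral = sum(f(a + (i + 0.5) * h) for i in range(grid)) * h
        values.append(integral * cells)   # 2**k times the integral over the cell
    return values

def besov_norm(f, alpha, p, max_level=8):
    """Truncated Besov(alpha, p) norm built from the stepwise approximants."""
    coarse = haar_step_values(f, 0)
    total = abs(coarse[0]) ** p
    for k in range(max_level):
        fine = haar_step_values(f, k + 1)
        # f_k - f_{k+1} is constant on each level-(k+1) cell of width 2**-(k+1)
        lp_p = sum(abs(coarse[l // 2] - fine[l]) ** p
                   for l in range(len(fine))) / len(fine)
        total += 2.0 ** (k * alpha * p) * lp_p
        coarse = fine
    return total ** (1.0 / p)
```

For the uniform density all approximants coincide, so the norm reduces to
$|f_0| = 1$; for a smooth non-constant density the level-$k$ terms decay
geometrically when $\alpha = 1/2$ and $p = 2$.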

The statement of Theorem 1 can now be completed by stating the assumption
on $\mathcal{F}$ needed for its validity.

Assumption A: $\mathcal{F}$ satisfies

$$\mathcal{F} \subset \Big\{ f : \inf_{0\le x\le 1} f(x) \ge \epsilon \Big\} \quad \text{for some } \epsilon > 0 \qquad (12)$$

and $\mathcal{F}$ is compact in both Besov(1/2,2) and Besov(1/2,4).

Other function spaces are also conventional for nonparametric statistical
applications of this type. The most common of these are based on either the Lipschitz
norm $\|f\|_\beta^{(L)}$ or the Sobolev norm $\|f\|_\beta^{(S)}$. These are defined for $\beta < 1$ by

$$\|f\|_\beta^{(L)} = \sup_{0\le x<y\le 1} \frac{|f(y) - f(x)|}{|y - x|^{\beta}}, \qquad \big(\|f\|_\beta^{(S)}\big)^2 = \sum_k |k|^{2\beta}\,|\vartheta_k|^2,$$

where $\vartheta_k = \int_0^1 f(x)e^{2\pi i k x}\,dx$ denote the Fourier coefficients of $f$. (Both spaces have
natural definitions for $\beta \ge 1$ as well, but we need consider here only the case $\beta < 1$.)

The following implies Assumption A and hence also suffices for validity of
Theorem 1.

Assumption A': $\mathcal{F}$ satisfies (12), and is bounded in the Lipschitz norm with
index $\beta$, and is compact in the Sobolev norm with index $\alpha$, where $\alpha > \beta$ and either
$\beta > 1/2$ or $\alpha > 3/4$ and $\alpha + \beta > 1$.

The following assumption is noticeably stronger than either A' or A, and is
used in Theorem 2.

Assumption B: $\mathcal{F}$ satisfies (12) and is bounded in the Lipschitz norm with
index $\beta$, where $\beta > 1/2$.

For more information about the relation of these spaces in this context see 
Brown, Cai, Low and Zhang (2002) and Brown, Carter, Low and Zhang (2002) 
(referred to as BCLZ below). 



3.4. Statistical equivalence theorems 

BCLZ then extended earlier results of Nussbaum (1996) and Klemelä and
Nussbaum (1998) to prove the following basic result:

Theorem a (BCLZ): Consider the statistical Formulations 1 and 2 with the
parameter space $\mathcal{F}$ and the relation (10). Assume $\mathcal{F}$ satisfies Assumption A (or
A'). Then the sequences of statistical problems defined in these two formulations are
asymptotically statistically equivalent.

BCLZ describes in detail a construction of $Z_n$ as a (randomized) function of
$\{T_1,\dots,T_N\}$. (More precisely, BCLZ describes the construction of the Haar basis
representation of $Z_n$, from which $Z_n$ can directly be recovered.) This construction
is invertible, in that $\{T_1,\dots,T_N\}$ can be recovered as a function of $Z_n$. Further,
BCLZ shows that both $Z_n$ and $W_n$ can be represented on the same probability space
so that their distributions, $P_{Z_n}$ and $P_{W_n}$, say, satisfy

$$\big\| P_{Z_n} - P_{W_n} \big\|_{TV} \to 0.$$
The mappings $\{Q_j^{(n)} : j = 1,2,\ n = 1,2,\dots\}$ that yield the equivalence of the above
theorem can then be directly inferred from this construction. To save space here
we refer the reader to that paper or Brown (2002) for details of the construction
and proof. It can be remarked that these bear considerable similarity to parts of
the construction and proof in Bretagnolle and Massart (1989) and other proofs of
KMT type theorems. But there are also some basic differences, especially those
related to the appearance of the square-root in the fundamental relation (10) and
the total variation norm in the definition of equivalence. In addition, the fact that
(8) is uniform in $\mathcal{Q}$ and $\mathcal{F}$ entails the need for various refinements in the proof.

Theorem 1 is now an immediate logical consequence of this result from BCLZ 
and the following lemma. 

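Lemma: Suppose $\mathcal{P}_j^{(n)} = \big(\mathcal{X}_j^{(n)}, \mathcal{B}_j^{(n)}, \mathcal{F}_j^{(n)}\big)$, $j = 1,2$, $n = 1,2,\dots$ are
asymptotically equivalent sequences of statistical problems on the same sequence of
parameter spaces, $\Theta^{(n)}$. Let $\{Q_j^{(n)} : j = 1,2,\ n = 1,2,\dots\}$ denote a sequence of mappings that
define this equivalence, as in (11). Then there are non-randomized mappings $\{\tilde{Q}_j^{(n)} :
j = 1,2,\ n = 1,2,\dots\}$ such that

$$P_F\big( \tilde{Q}_j^{(n)} = Q_j^{(n)} \big) \ge 1 - c_n \quad \text{for every } F\in\mathcal{F}_j^{(n)},\ j = 1,2,\ n = 1,2,\dots \qquad (13)$$

and for every $\theta\in\Theta^{(n)}$

$$P_{\theta,j}\big( \tilde{Q}_j^{(n)} \in A \big) = P_{\theta,k}\big( X_k^{(n)} \in A \big), \quad \theta\in\Theta^{(n)}, \qquad (14)$$

for every measurable $A \subset \mathcal{X}_k^{(n)}$, $j,k = 1,2$, $j\ne k$, $n = 1,2,\dots$.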

Proof of Lemma: Fix $n$, $j$, $k\ne j$, $\theta\in\Theta^{(n)}$. Let $F_k$ denote the distribution
under $\theta$ of $X_k$ and let $F'_k$ denote the distribution under $\theta$ of $Q_j(X_j)$. Let
$H = \min(F_k, F'_k)$. Let $\infty \ge f'_k = \frac{dF'_k}{dH} \ge 1$. Then define $\tilde{Q}_j$ as a version of the
randomized map satisfying



Qj{B\x) = -jn—:Qj(B\x) + f -±—- {F' k (B) - H{B)) . 

Jk\ x ) Jk 

This completes the proof of the lemma, and consequently also that of Theorem 1. □
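The idea behind the lemma, modifying a mapping on a small-probability set so that
its output has exactly the target distribution, resembles the standard maximal
coupling of two distributions through their common part $H = \min(\cdot,\cdot)$. The
sketch below illustrates that standard construction for finite discrete
distributions only; it is an analogy offered under that assumption, not the
construction used in the proof, and the names `p`, `q` and `coupled_draw` are
hypothetical. Here `p` plays the role of the distribution of the mapped
observation and `q` the role of the target distribution.

```python
import random

def coupled_draw(p, q, rng):
    """Return (y, y_tilde) with y ~ p, y_tilde ~ q and y_tilde == y as often as possible."""
    support = range(len(p))
    h = [min(a, b) for a, b in zip(p, q)]        # common part H = min(p, q)
    y = rng.choices(support, weights=p)[0]       # the original (unmodified) output
    if rng.random() * p[y] < h[y]:               # keep y with probability h[y] / p[y]
        return y, y
    excess = [b - c for b, c in zip(q, h)]       # part of q not covered by H
    return y, rng.choices(support, weights=excess)[0]

def agreement_rate(p, q, reps=20000, seed=0):
    """Fraction of draws in which the modified output equals the original one."""
    rng = random.Random(seed)
    same = sum(1 for _ in range(reps) if len(set(coupled_draw(p, q, rng))) == 1)
    return same / reps
```

With `p = [0.5, 0.3, 0.2]` and `q = [0.4, 0.4, 0.2]` the two outputs agree with
probability equal to the total mass of the common part, 0.9, while the second
output is distributed exactly according to `q`.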

Theorem 2 requires a slightly different fundamental result. The following result 
is the foundation for the proof of Theorem 2. It is adapted from Theorem 2 of BCLZ. 
This result closely resembles Theorem a, above, but as noted in BCLZ it appears to 
require a modified construction for its proof. The argument there is based heavily 
on results in Carter (2001). 

Theorem b (BCLZ): Consider the statistical Formulations 1" and 2 with
the parameter space $\mathcal{F}$ and the relation (10). Assume $\mathcal{F}$ satisfies Assumption B.
Then the sequences of statistical problems defined in these two formulations are
asymptotically statistically equivalent.